Add JSON Schema Definition for gen_ai.tool.definitions#3378

Open
Cirilla-zmh wants to merge 11 commits into open-telemetry:main from Cirilla-zmh:minghui/tool_definitions

Conversation

@Cirilla-zmh
Member

Fixes #2721 #1835

Changes

This PR is a continuation of #2942 and #2793. My apologies for accidentally closing the previous one.

Add JSON schema definition for gen_ai.tool.definitions.

Important

Pull request acceptance is subject to the triage process as described in Issue and PR Triage Management.
PRs that do not follow the guidance above may be automatically rejected and closed.

Merge requirement checklist

  • CONTRIBUTING.md guidelines followed.
  • Change log entry added, according to the guidelines in When to add a changelog entry.
    • If your PR does not need a change log, start the PR title with [chore]
  • Links to the prototypes or existing instrumentations (when adding or changing conventions)

Change-Id: I63f11ca8081f71b55b793ec88f35ef64bbace8e6
Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: Ib6133b3019195e4fa9a54d329ff4b891a281d208
Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: I5803322306a93b14f83b858b0deb632074e1d9c0
Co-developed-by: Cursor <noreply@cursor.com>
@Cirilla-zmh Cirilla-zmh requested review from a team as code owners February 3, 2026 06:36
@github-actions github-actions bot added the enhancement and area:gen-ai labels Feb 3, 2026
@Cirilla-zmh Cirilla-zmh moved this from Untriaged to Needs More Approval in Semantic Conventions Triage Feb 3, 2026
@Cirilla-zmh
Member Author

Cirilla-zmh commented Feb 3, 2026

I apologize that the previous PR (#2942) was closed due to an incorrect rebase. All comments from the original PR have been addressed, and I believe this PR is now ready to be merged.

Here is the change that I have made: 6838193
cc @lmolkova @DylanRussell @gyliu513 @alexmojaki

@KalleOlaviNiemitalo KalleOlaviNiemitalo left a comment

These are just my observations. I am not requesting any change.

@Cirilla-zmh Cirilla-zmh moved this from Needs More Approval to Ready to be Merged in Semantic Conventions Triage Feb 5, 2026
@lmolkova lmolkova moved this from Ready to be Merged to Needs More Approval in Semantic Conventions Triage Feb 5, 2026
@DylanRussell

A few thoughts:

https://github.com/open-telemetry/opentelemetry-python-contrib/pull/4142/changes shows how the GCP GenAI instrumentation currently records tool definitions. It seems we always have a name and description field. We didn't include the function params, and I'm not sure why. Should we add a name field to FunctionToolDefinition?

Should we just put tool definitions behind the content capture flag (OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT)? We did for the Google GenAI instrumentation. Then we probably don't need to put function params behind an additional flag. See https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/#full-buffered-content for context on this flag.

Should we have a schema specific to the Mcp.McpTool type?

@Cirilla-zmh
Member Author

Should we add a name field to FunctionToolDefinition?

Actually, we've already done that. FunctionToolDefinition inherits from GenericToolDefinition, so it has a name field.
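
A minimal sketch of that relationship, using plain dataclasses rather than the Pydantic models in the PR's notebook. Only the two class names and the `type`, `name`, `description`, and `parameters` properties come from this thread; the field types and defaults are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class GenericToolDefinition:
    # Properties every tool definition carries.
    type: str
    name: str


@dataclass
class FunctionToolDefinition(GenericToolDefinition):
    # `type` and `name` are inherited from GenericToolDefinition.
    # The optional fields and their defaults are illustrative assumptions.
    description: Optional[str] = None
    parameters: Optional[Dict[str, Any]] = None


# `get_weather` is a made-up tool name used only for this example.
tool = FunctionToolDefinition(type="function", name="get_weather")
print(tool.name)  # prints "get_weather"
```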

Should we just put tool definitions behind the content capture flag (OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT)? We did for the Google GenAI instrumentation. Then we probably don't need to put function params behind an additional flag.

I believe we should follow the content capture flag, but the behavior will differ somewhat from that of chat messages. Please see the description of gen_ai.tool.definitions: "Since this attribute could be large, it's NOT RECOMMENDED to populate non-required properties by default. Instrumentations MAY provide a way to enable populating optional properties."

That is to say, instrumentations should always capture type and name. Once content capture is enabled, the optional properties, such as description and parameters, should also be captured.
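
As a rough sketch of that behavior, assuming tool definitions arrive as plain dictionaries: the helper name `to_recorded_definitions` and its `capture_content` parameter are hypothetical, not part of any instrumentation API, and how the flag value is obtained is left out on purpose.

```python
import json
from typing import Any, Dict, List

# Properties that are always recorded, per the comment above.
REQUIRED_KEYS = ("type", "name")


def to_recorded_definitions(
    tools: List[Dict[str, Any]], capture_content: bool
) -> List[Dict[str, Any]]:
    """Return the tool definitions in the form that would be recorded."""
    if capture_content:
        # Content capture enabled: keep optional properties as well.
        return [dict(tool) for tool in tools]
    # Content capture disabled: keep only the required properties.
    return [{k: tool[k] for k in REQUIRED_KEYS if k in tool} for tool in tools]


# Made-up tool definition used only for this example.
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Look up the weather for a city.",
        "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
    }
]

print(json.dumps(to_recorded_definitions(tools, capture_content=False)))
# prints [{"type": "function", "name": "get_weather"}]
```

With `capture_content=True` the same call returns the definitions unchanged, including `description` and `parameters`.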

Should we have a schema specific to the Mcp.McpTool type?

Of course! This is an initial PR, and I want to keep the scope limited. I'd rather not include additional definitions like Built-in or MCP Tools yet; we can follow up on those in a separate issue.

@DylanRussell

OK, mostly SGTM.

That is to say, instrumentations should always capture type and name. Once content capture is enabled, the optional properties, such as description and parameters, should also be captured.

Do we want to explicitly recommend that instrumentations reuse the content capture flag for this purpose? The language you have now doesn't really do that.

Also, do we want an optional response type on the FunctionToolDefinition schema? Gemini allows this: https://docs.cloud.google.com/vertex-ai/generative-ai/docs/reference/rest/v1beta1/FunctionDeclaration

@Cirilla-zmh
Member Author

Do we want to explicitly recommend that instrumentations reuse the content capture flag for this purpose? The language you have now doesn't really do that.

For the implementation, I believe we should reuse the content capture flag. Do you think we should add more relevant details to this description?

1. [Default] Don't record instructions, inputs, or outputs.
2. Record instructions, inputs, and outputs on the GenAI spans using corresponding
attributes (`gen_ai.system_instructions`, `gen_ai.input.messages`,
`gen_ai.output.messages`).
This approach is best suited for situations where telemetry volume is manageable
and either privacy regulations do not apply or the telemetry storage complies
with them, for example, in pre-production environments.
See [Recording content on attributes](#recording-content-on-attributes)
section for more details.
3. Store content externally and record references on the spans.
This pattern is recommended in production environments where telemetry volume
is a concern or sensitive data needs to be handled securely. Using external
storage enables separate access controls.
See [Uploading content to external storage](#uploading-content-to-external-storage)
section for more details.

Also, do we want an optional response type on the FunctionToolDefinition schema? Gemini allows this: https://docs.cloud.google.com/vertex-ai/generative-ai/docs/reference/rest/v1beta1/FunctionDeclaration

Good point. @lmolkova and I have discussed this before and we decided to ignore this field for now, considering that most providers other than Gemini do not offer this definition. See #2942 (comment)

@github-actions

This PR has been labeled as stale due to lack of activity. It will be automatically closed if there is no further activity over the next 7 days.

@github-actions github-actions bot added the Stale label Mar 17, 2026
Change-Id: Id5bd556547ef2339e3851c6397ea55e8ea446396
Co-developed-by: Cursor <noreply@cursor.com>
@Cirilla-zmh
Member Author

@DylanRussell @aabmass @wikaaaaa
Hi, I’ve addressed your comments. Please take another look when you have time. Thank you!

@github-actions github-actions bot removed the Stale label Mar 23, 2026
Change-Id: I5b9bb8e5604df42d7c66a031b1bd054cc27af150
Co-developed-by: Cursor <noreply@cursor.com>
@Cirilla-zmh Cirilla-zmh requested a review from singankit April 1, 2026 02:44
Member

@aabmass aabmass left a comment

LGTM, @DylanRussell can you take one more look?

@DylanRussell

LGTM too

@trask trask requested review from Copilot and removed request for KalleOlaviNiemitalo and wikaaaaa April 2, 2026 19:04
Contributor

Copilot AI left a comment

Pull request overview

Adds a dedicated JSON Schema for the gen_ai.tool.definitions attribute and updates the GenAI semantic convention documentation to reference and exemplify the schema across provider pages and non-normative materials.

Changes:

  • Introduces docs/gen-ai/gen-ai-tool-definitions.json JSON schema for gen_ai.tool.definitions.
  • Updates the GenAI registry model and multiple docs pages to require the schema and clarify span vs event recording formats.
  • Expands non-normative examples/models to include gen_ai.tool.definitions and demonstrate “simplified” recording when content capture is disabled.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| model/gen-ai/registry.yaml | Updates the normative registry entry for gen_ai.tool.definitions to reference the new JSON schema and recording requirements. |
| docs/registry/attributes/gen-ai.md | Updates the attribute registry documentation to reference the new JSON schema and recording requirements. |
| docs/gen-ai/openai.md | Updates OpenAI-specific documentation to reference the new tool definitions schema. |
| docs/gen-ai/anthropic.md | Updates Anthropic-specific documentation to reference the new tool definitions schema. |
| docs/gen-ai/aws-bedrock.md | Updates AWS Bedrock-specific documentation to reference the new tool definitions schema. |
| docs/gen-ai/azure-ai-inference.md | Updates Azure AI Inference-specific documentation to reference the new tool definitions schema. |
| docs/gen-ai/gen-ai-spans.md | Updates span documentation for gen_ai.tool.definitions to reference the new schema and recording requirements. |
| docs/gen-ai/gen-ai-events.md | Updates event documentation for gen_ai.tool.definitions to reference the new schema and recording requirements. |
| docs/gen-ai/gen-ai-agent-spans.md | Updates agent span documentation for gen_ai.tool.definitions to reference the new schema and recording requirements. |
| docs/gen-ai/non-normative/models.ipynb | Adds non-normative Pydantic models and schema-generation snippet for gen_ai.tool.definitions. |
| docs/gen-ai/non-normative/examples-llm-calls.md | Adds gen_ai.tool.definitions to examples, including simplified recording when content capture is disabled. |
| docs/gen-ai/gen-ai-tool-definitions.json | New JSON schema artifact for tool definitions. |
| .chloggen/schema_of_gen_ai_tool_definitions.yaml | Adds a changelog entry describing the enhancement. |

Change-Id: Ie5680c746199fadbddee38eaa212d511f3713f29
Co-developed-by: Cursor <noreply@cursor.com>
@Cirilla-zmh Cirilla-zmh requested a review from trask April 6, 2026 13:00
@trask
Member

trask commented Apr 7, 2026

Thanks @Cirilla-zmh! I've posted a prototype for this at trask/genai-otel-conformance#146

Change-Id: I58e6a03cfade2e21142bd19b7ccdcbc17530c308
Co-developed-by: Cursor <noreply@cursor.com>
Change-Id: I2d2a610a1daa38e12c3e7e996ad0a5ad9e82c63b
Co-developed-by: Cursor <noreply@cursor.com>

Labels

area:gen-ai, enhancement

Projects

Status: Needs More Approval

Development

Successfully merging this pull request may close these issues.

Define schema for Tool Definitions for Single and Multi-Agent Spans

10 participants